ABSTRACT

TRF1 and TRF2 are key proteins in human telo- 
meres, which, despite their similarities, have differ- 
ent behaviors upon DNA binding. Previous work has 
shown that unlike TRF1, TRF2 condenses telomeric DNA,
thus creating consequential negative torsion on the 
adjacent DNA, a property that is thought to lead to 
the stimulation of single-strand invasion and was 
proposed to favor telomeric DNA looping. In this 
report, we show that these activities, originating 
from the central TRFH domain of TRF2, are also dis- 
played by the TRFH domain of TRF1 but are re- 
pressed in the full-length protein by the presence 
of an acidic domain at the N-terminus. Strikingly, a 
similar repression is observed on TRF2 through the 
binding of a TERRA-like RNA molecule to the 
N-terminus of TRF2. Phylogenetic and biochemical 
studies suggest that the N-terminal domains of TRF 
proteins originate from a gradual extension of the 
coding sequences of a duplicated ancestral gene 
with a consequential progressive alteration of the
biochemical properties of these proteins. Overall,
these data suggest that the N-termini of TRF1 and 
TRF2 have evolved to finely regulate their ability to 
condense DNA. 

INTRODUCTION 

Telomeres are specialized structures protecting the natural 
termini of linear chromosomes from degradation and 
illicit repair (1). They are assembled through associations 
between telomeric DNA, a TTAGGG repeat-containing
sequence that ends with a single-stranded 3' overhang, and
telomere-specific proteins. The transcription of telomeric
DNA produces a UUAGGG repeat-containing RNA,
TERRA, which is anticipated to play fundamental roles
in telomere biology (2). 

Several protein complexes associate with telomeric 
DNA. Among these, the shelterin complex in mammals 
involves six proteins (TRF1, TRF2, RAP1, TIN2, TPP1 
and POT1) (3,4). Binding of this complex to DNA 
is mediated by the two double-stranded DNA (dsDNA)- 
binding proteins TRF1 (5,6) and TRF2 (7,8) and 
the G-tail binding protein POT1 (9). The other three
proteins do not interact directly with DNA (10-14). Other
complexes containing fewer members of the shelterin 
complex have also been described (15). Telomeres have 
the capacity to fold into t-loops, lasso-like structures 
that have been observed on purified telomeres of diverse 
origins (16-20). In vitro, formation of t-loops depends on 
TRF2 and has been proposed to involve the cis-oriented
invasion of the telomeric dsDNA by the G tail (17,21). 
One interesting concept that emerged from recent bio- 
chemical studies on TRF2 is that its biological functions 
seem closely driven by its intrinsic properties. Its capacity 
to bind telomeric DNA ends (22,23) has been implicated 
in the inhibition of the non-homologous end-joining 
pathway (24). Its N-terminal domain binds the center 
of Holliday junctions, protecting them from resolvase 
cleavage (25), a property possibly explaining TRF2's 
ability to protect the t-loops from resolution. TRF2 has 
also been shown to stimulate the invasion of a telomeric 
single-strand inside a homologous duplex in free DNA 
(26) and in the context of the nucleosomal fiber (27), a 
process thought to participate in the formation of t-loops 
and chromatin loops. This stimulation was proposed to 
result from a change in topology generated by the conden- 
sation of the DNA. This latter property was shown to pri- 
marily involve the C-terminal Telobox/Myb-like and the 
homodimerization TRFH domains of the protein (26). 
Despite the close resemblance between these domains in 
TRF1 and TRF2, TRF1 exhibits a very different behavior 
towards DNA. In this study, we investigated this paradox 
and show that TRF1 and TRF2 are not as different as 
hitherto assumed. Through studies of different mutants 
and chimeras of these proteins, we uncover the important 
regulatory role played by the N-termini in the biochemical 
properties of these proteins. 

MATERIALS AND METHODS 

Plasmids and oligonucleotides 

pTelo2 and pLTelo are both pUC19-based plasmids con-
taining 650 bp of human telomeric repeats.

Proteins 

Cloning of protein genes is described in Supplementary
Data. TRF proteins and mutants were: TRF2 (3-500),
TRF2 ΔB (47-500), TRF2 ΔD (3-44, 244-500), TRF2 ΔBΔD
(244-500), TRF2 ΔL (3-248, 445-500), TRF2 ΔD TRF1 D
(TRF2 3-44 TRF1 65-264 TRF2 244-500), TRF1
(2-439), TRF2 hAΔB (hTRF1 2-67 TRF2 47-500),
TRF2 mAΔB (mTRF1 2-63 TRF2 47-500), TRF2 cAΔB
(cTRF1 2-27 TRF2 47-500) and TRF1 ΔA (65-248). All
proteins were fused to a 6-histidine tag and produced as
published, with a gel filtration chromatography step added
when necessary (26). A Coomassie blue-stained sodium
dodecyl sulfate polyacrylamide gel electrophoresis (SDS- 
PAGE) gel of the wild type and mutant proteins is shown 
in Supplementary Figure S1A. Untagged TRF2 protein
was also used in a control topology experiment
(see below). In that case, the tagged protein was cleaved
using Tobacco Etch Virus Protease I, the histidine tag was
removed by Ni-column chromatography and the protein was
further purified by two chromatographic steps (heparin and
gel filtration).

Strand Invasion assay, electrophoretic mobility shift assay 
and topology assays 

Strand invasion assay, electrophoretic mobility shift assay 
(EMSA) and topology assays were performed as 
published (26). A control topology assay using untagged
TRF2 protein was performed to verify that the histidine
tag has no effect (Supplementary Figure S1B).

Surface plasmon resonance (SPR)-based invasion assay 

The experiments were performed on a Biacore T100 (GE 
Healthcare) using C1 sensor chips (GE Healthcare).
Binding of the T15G probe and the R100 control 
(a non-telomeric random 100 bases oligonucleotide, 
Supplementary Data) was performed via a streptavidin-
biotin interaction under the recommended conditions. A signal of
150 RU was achieved for both the captured telomeric 
(T15G) and control (R100) oligonucleotides. pTelo2 
(50 nM) and proteins (200 nM) were pre-incubated in 
HBS-ET buffer (10 mM HEPES pH 7.4, 300 mM NaCl, 
3 mM EDTA, 0.05% Tween 20) for 15 min at room
temperature. Samples were injected at 10 µl/min and the
washing step was performed using HBS-ET buffer. Surfaces
were regenerated by sequential injections of water, 1 M
NaCl, and 0.1% SDS for 30 s. Each sensorgram was
corrected by subtraction of the signal obtained from the 
control flow cell (R100). 
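
The reference subtraction described above amounts to a point-by-point difference between the two flow cells. A minimal Python sketch, assuming both sensorgrams are sampled at the same time points; the function name and the response values are hypothetical and are neither taken from the Biacore software nor from our measurements:

```python
def subtract_reference(active, reference):
    """Return the point-by-point difference (in RU) between the sensorgram of
    the telomeric (T15G) flow cell and that of the control (R100) flow cell."""
    if len(active) != len(reference):
        raise ValueError("sensorgrams must be sampled at the same time points")
    return [a - r for a, r in zip(active, reference)]


# Hypothetical responses (RU) recorded at identical time points.
t15g_cell = [0.0, 120.0, 310.0, 440.0]
r100_cell = [0.0, 2.0, 5.0, 6.0]

corrected = subtract_reference(t15g_cell, r100_cell)
print(corrected)
```

The corrected trace is what would then be read at the measure point used for comparisons between proteins.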

Atomic force microscopy imaging 

DNA was incubated with proteins for 20 min at 25°C in
5 mM HEPES pH 7.4, 150 mM KCl. The protein/DNA
molar ratios used were: 10/10 for TRF1, TRF2, TRF2 hAΔB
and TRF2 ΔL; 3/6 for TRF1 ΔA and TRF2 ΔB; 6/9 for
TRF2 mAΔB; 10/9 for TRF2 cAΔB. Samples were
crosslinked with glutaraldehyde (0.1% final) for 30 min
on ice and applied on freshly cleaved mica surfaces
treated with 10 mM MgCl2. After 2 min, mica was
washed with deionized water and dried. Imaging was per-
formed on a Nanoscope IIIa equipped with E-scanner
(Digital Instruments Inc., Santa Barbara, CA, USA), in
air under Tapping Mode using silicon tips. Images were
recorded at 1.5-2.0 Hz over scan areas 1 µm wide
(512 × 512 pixels). Raw scanning force microscopy
(SFM) images were flattened using the manufacturer's 
software and converted into TIF files. Contour lengths 
(CLs) were measured by the read-through length method 
using SigmaScan Pro software (SPSS Inc., Chicago, IL, 
USA). Volumes were calculated as half-ellipsoids as published
(26). Between 150 and 300 objects were scored for
each condition. Detailed information on the construc-
tion of the 2D probability density maps is given in
Supplementary Data.
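
For illustration, the volume and density-map calculations can be sketched in Python. This is only a sketch under stated assumptions: the volume uses the generic half-ellipsoid formula V = (2/3)πabh, with a and b the semi-axes of the footprint and h the height, and the map is a plain 2D histogram normalized so that all probabilities sum to 1. The exact procedures are those of ref. (26) and Supplementary Data; the function names, bin sizes and input values here are hypothetical.

```python
import math
from collections import Counter


def half_ellipsoid_volume(length_nm, width_nm, height_nm):
    """Volume of a half-ellipsoid with the given footprint length and width
    (full axes) and apex height: (2/3) * pi * a * b * h."""
    a = length_nm / 2.0  # semi-axis along the footprint length
    b = width_nm / 2.0   # semi-axis along the footprint width
    return (2.0 / 3.0) * math.pi * a * b * height_nm


def probability_map(points, volume_bin, cl_bin):
    """Bin (volume, contour length) pairs and return a dictionary mapping each
    occupied bin index (ix, iy) to the fraction of complexes it contains."""
    counts = Counter((int(v // volume_bin), int(cl // cl_bin)) for v, cl in points)
    total = sum(counts.values())
    return {cell: n / total for cell, n in counts.items()}


# Hypothetical complexes given as (volume, DNA contour length) pairs.
complexes = [(1, 1), (1, 2), (15, 30), (16, 31)]
print(half_ellipsoid_volume(2.0, 2.0, 1.0))
print(probability_map(complexes, 10, 10))
```

Each bin value plays the role of p(x,y) for the corresponding volume and contour-length interval.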

Phylogenetic studies 

Pblast. The Telobox sequence from the human TRF1 and 
TRF2 was blasted against the NCBI protein database. 
The resulting alignment was used to generate the 



2568 Nucleic Acids Research, 2012, Vol. 40, No. 6 



PhyML tree. We downloaded the protein families defined
in the Ensembl database version 56 (as of September 2009)
(www.ensembl.org/) ENSFM00250000004074 and
ENSFM00250000007334 and also added sequences from
the NCBI database (www.ncbi.nlm.nih.gov/). All align-
ments were manually verified using SeaView to exclude
redundant and improperly annotated sequences (28). We
downloaded cDNAs from NCBI and also checked EST
data. The best alignment obtained using ClustalW and
Muscle (29,30) was selected and manually refined. All
nucleotide sequences retrieved were translated before our
phylogenetic analyses.

RESULTS 

The acidic domain of TRF1 inhibits its ability to condense 
DNA 

The TRFH domain of TRF2 (here called D domain, 
Figure 1A) plays a critical role in the ability of TRF2 to 
condense DNA and to stimulate telomeric invasion (26). 
In view of the structural homology between these D 
domains in TRF1 and TRF2, it was therefore surprising 
to observe that TRF1 inefficiently condensed DNA. Two 
hypotheses could explain this difference: (i) the TRF1 and 
TRF2 D domains could be functionally different or (ii) the 
D domain of TRF1 could also be capable of DNA con- 
densation but this property is inhibited in the full-length 
protein. To investigate this question, we have constructed 
several mutants of TRF2 (Figure 1A, Supplementary 
Figure S1A) and analyzed their ability to condense
DNA using a topology assay and atomic force microscopy 
(AFM) (26). The former assay is based on the analysis of 
the topology of DNA by gel electrophoresis after incuba- 
tion with the protein in the presence of wheat germ topo- 
isomerase I. This enzyme can remove the supercoils, 
located in the unbound part of a closed DNA molecule, 
generated by the binding on this DNA of a topologically 
active protein. Modification of DNA topology caused by 
this type of protein can be visualized through the appear-
ance of topoisomers on a gel (Figure 1B). This experiment
allows the measurement of the average number of super-
coils created (Figure 1B, right) and these turns can be
characterized as being positive or negative by using 
chloroquine in the experiment. Indeed, this drug increases 
the rate of migration of positively supercoiled DNA 
and conversely decreases this rate for negatively super- 
coiled DNA compared to controls (Supplementary 
Figure S1C). AFM allows the direct visualization of the
DNA-protein complexes and the measurement of both the
contour length (CL) of the DNA and the volume of 
the complexes (Figure 1C). From these numbers, 
color-coded 2D-probability density maps were drawn 
showing the probability (p(x,y)) of a given complex to 
have a volume x and a y DNA CL. As shown in 
Figure 1C, most of the TRF1 complexes exhibit a small 
volume and a long DNA CL, implying a lack of DNA 
condensation. This is confirmed by the topology assay 
showing that TRF1 inefficiently creates supercoils in 
DNA (Figure 1B). In contrast, TRF2 causes a significant
decrease in DNA CL (Figure 1C), thus showing DNA
condensation. TRF2 creates positive supercoils (Figure
1B and Supplementary Figure S1) that can also be
observed by AFM imaging of the DNA resulting from 
the topology assay (Supplementary Figure S2). 
Comparison of the different mutants schematically 
shown in Figure 1A reveals several key points: (i) 
deleting the acidic domain of TRF1 (TRF1 ΔA) greatly
increases TRF1's ability to condense DNA, thus trans-
forming it into a TRF2 ΔB-like protein; (ii) adding the
A domain of TRF1 on TRF2 (TRF2 hAΔB) does the
reverse, transforming TRF2 into a TRF1-like protein;
(iii) the linker or hinge domain that separates the D
domain and the Myb-like domains has little role in this
function (TRF2 ΔL); (iv) the B domain of TRF2 seems to
stimulate TRF2's ability to modify DNA topology
(TRF2 ΔB); (v) the TRFH domain of TRF2 is required
(TRF2 ΔD, TRF2 ΔBΔD); and (vi) the TRFH domains of
TRF1 and TRF2 seem interchangeable, since the
chimeric protein TRF2 ΔD TRF1 D is very efficient in mod-
ifying DNA topology.

In summary, both TRF1 and TRF2 have the ability to 
condense DNA, but the N-terminal acidic domain of 
TRF1 prevents this condensation. 
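
The average number of supercoils reported by the topology assay can be thought of as an intensity-weighted mean over the topoisomer bands. The Python sketch below assumes that each band has been assigned a supercoil number relative to the relaxed control and a background-corrected intensity. This quantification scheme, the function name and the example values are assumptions made for illustration, not a description of the published procedure (26).

```python
def mean_supercoils(band_intensities):
    """Intensity-weighted mean number of supercoils.

    band_intensities maps the number of supercoils of each topoisomer band
    (relative to the relaxed control) to its measured intensity.
    """
    total = sum(band_intensities.values())
    if total <= 0:
        raise ValueError("total band intensity must be positive")
    return sum(n * intensity for n, intensity in band_intensities.items()) / total


# Hypothetical lane: five topoisomer bands centered on two supercoils.
lane = {0: 10.0, 1: 20.0, 2: 40.0, 3: 20.0, 4: 10.0}
print(mean_supercoils(lane))
```

The sign of the supercoils is not determined by this calculation; in the assay it is assigned separately using chloroquine.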

TRF2 dramatically increases the rate of telomeric strand 
invasion 

As a consequence of this change of DNA topology, TRF2 
was shown to increase the invasion of telomeric double
strand by a homologous single-stranded probe (26).
In order to verify that the acidic domain could also be 
responsible for the lower efficiency of TRF1 in telomeric 
invasion, we analyzed the invasion activity using the 
pTelo2 telomeric plasmid and a single-stranded probe 
containing 15 TTAGGG telomeric repeats (T15G) with 
two different assays: an invasion assay based on gel elec-
trophoresis, which measures the association at steady state,
and an assay using SPR technology to monitor the same
association in real time. In the latter experiments, we
measured the changes in refractive index (in arbitrary 
units RU) due to the binding of the pTelo2 plasmid on 
the T15G single-stranded probe immobilized on the chip 
(31). An SPR sensorgram presents two phases
(Supplementary Figure S3A): injection, when the analyte
is continuously injected, and washing, to follow dissoci-
ation. First, the samples were injected until the signal
reached a plateau. With pTelo2 alone the response was
very weak (RU max around 10) and reached a plateau
in about 11 h. However, in the presence of TRF2, the
signal reached the plateau in about 15 min for a
response 60 times higher. This suggests that TRF2 dra- 
matically increases the association rate. Then, we per- 
formed comparative analysis between TRF1 and TRF2 
on different substrates. Samples were injected for only 
3 min and the measure performed 1 min after the injection 
stop (measure point in Supplementary Figure S3B). 
Results are presented as histograms of the corresponding 
measure points. The response obtained in the presence of 
TRF2 was 44 times higher than the one obtained with
pTelo2 alone, corresponding to a 44-fold stimulation
of invasion. This increase could not be due to

Figure 1. The acidic domain of TRF1 prevents DNA condensation and also inhibits the modification of DNA topology. (A) Schematic represen-
tation of the TRF1 and TRF2 wild type and mutant proteins. (B) Topology assay performed with the telomeric plasmid pLTelo and 1 µM of each
protein using wheat germ topoisomerase I (WG Topo I). Samples were analyzed by TBE agarose gel electrophoresis. SC stands for supercoiled
plasmid, RC for relaxed circular, N for nicked. The plot on the right panel shows the average number of supercoils created by the proteins calculated
from gels such as the one shown in (B). Error bars show standard deviation. (C) 2D-probability density maps of volumes and DNA contour length
(CL) measured by AFM for the complexes formed with TRF1, TRF2 hAΔB, TRF1 ΔA, TRF2 ΔB and TRF2. The red and green crosses indicate the
position in the graph of the majority of TRF1 and TRF2 complexes, respectively.

the binding of TRF2 on T15G since no change in SPR
response could be seen when using pUC19 (Figure 2B). 
The dissociation phase was slow, suggesting that the 
major effect of the protein was on the association rate. 
TRF1 had a much lower effect than TRF2 (<10%, 
Supplementary Figure S3C and G). Gel-based assays per- 
formed with the same proteins (Supplementary Figure 
S3D, E and G) give the same trend. As expected, the 
effect measured for TRF2 was higher in SPR since, in 
this experiment, even unstable complexes would be 
scored. As seen in the invasion assay (26), the TRF2 
effect was more potent on a closed supercoiled plasmid 
than on a linear molecule (Supplementary Figure S3F), 
showing the importance of topological constraints. 
Overall, data obtained with the two methods are in good 
agreement which suggests that both experiments monitor 
the same phenomenon. Furthermore, we can say that the 
main effect of TRF2 on telomeric invasion is to greatly 
increase the association rate of the double-stranded target 
with the invading single strand. 

The acidic domain of TRF1 inhibits its ability to promote 
strand invasion 

Data obtained for a large panel of mutants clearly show a 
remarkable correlation between the ability to condense 
and to promote strand invasion (Figure 2). Proteins inef-
ficient in condensing DNA (TRF1, TRF2 hAΔB) were also
poor in stimulating invasion in both gel-based and SPR
assays. Conversely, proteins efficient in DNA condensa-
tion (TRF2, TRF1 ΔA, TRF2 ΔB and TRF2 ΔD TRF1 D)
were found to be active in strand invasion. In summary, 
we clearly establish that TRF proteins have the inherent 
capacity of condensing DNA, modifying DNA topology 
and stimulating invasion but this intrinsic property is 
nearly lost in TRF1 through the presence of an acidic 
domain at its N-terminus while in TRF2 it is stimulated 
by the presence of a basic domain. This prompted us to 
investigate how these domains evolved. 

Evolution of the N-terminal domains of TRF1 and TRF2
in the vertebrate lineage

As telomeric DNA is highly conserved, we began our 
analysis by blasting (Pblast) the human Telobox DNA- 
binding domain against Metazoans. As expected (5,7), 
we found that TRF Telobox sequences form a distinct 
monophyletic group (Supplementary Figure S4). 
Moreover, it is possible to distinguish a signature differen- 
tiating TRF1 from TRF2 (Supplementary Figure S5). 
Studies of the available genomes of Prochordates, the 
Urochordates Ciona intestinalis and Ciona savignyi, and 
the Cephalochordate Amphioxus, Branchiostoma 
floridae, reveal a single copy of the telobox with sequences 
highly similar to that of the two human TRFs. Of note, 
the genome of the lamprey (Petromyzon marinus), one of 
the most basal organisms that split off from the 
Gnathostomes about 540 MYA, shows the two
forms of telobox. Altogether, these results are in favor 
of a duplication of a unique ancestral gene during one 
of the two rounds of genome duplications (32) that 
occurred early in the Chordate lineage.

Figure 2. TRF1 can stimulate invasion but is prevented from doing so by its
N-terminal acidic domain. (A) Invasion assays performed with super-
coiled pTelo2 and increasing concentrations (10, 30, 100, 300 nM and
1 µM) of TRF1, TRF2, TRF2 ΔD TRF1 D, TRF1 ΔA, TRF2 ΔB and
TRF2 hAΔB. (B) Histogram of the SPR values obtained after injection
of a control plasmid without telomeric repeats (pUC19, 50 nM) with or
without pre-incubation with TRF2 (200 nM). (C) Histogram of the
SPR values obtained after injection of supercoiled pTelo2 (50 nM)
with or without pre-incubation with TRF1, TRF2, TRF2 ΔD TRF1 D,
TRF1 ΔA and TRF2 hAΔB (200 nM). Error bars show standard devi-
ation. (D) Summary table showing, for each protein, the concentration
necessary for binding half the quantity of a double-stranded DNA in a
standard EMSA (binding capacity), the relative stimulation of invasion
calculated at maximum effect and the relative increase in SPR response

Outside Eutherians, only a few N-terminal sequences
are available (Figure 3A, accession numbers in 
Supplementary Figure S6). However, some key species 
allow the plotting of a plausible scenario for the evolution 
of these domains. The TRF2 B domain is very short in 






TRF1_HUMAN MAEDVSSAAP SPRGCADGRD ADPTEEQMAE TERNDEEQFE CQELLECQVQ VGAPEEEEEE EEDAGLVAEA EAVAAG
TRF1_CHIMPANZEE MAEDVSSAAP SPRGCADGRD ADPTEKQMAE TERNDEEQFE CQELLECQVQ VGAPDEEEEE EEDAGLVAEA EAVAAG
TRF1_ORANGUTAN MAEDVASAAP SPRGCADGRD ADPTEEQMAE TERNDEEQFE CQELLECQVQ VGAP -- EEEE E-DAGLVAEA EAVAAG
TRF1_MACAQUE MAEDASSAAP SPRGCADGRD ADTTEEQMAE TETNDEEQFE CQELLECQVQ VGAP -- EEEE EEDEGLVTEA EAVAAG
TRF1_MARMOSET MAEDASSAAP SPRGCADGRD ADATEERIAE TERNDEEQFE CQELLECQMQ LGAS EEE EEDAGFVAEA EAVAAG
TRF1_HAMSTER MAEDVSSTAP SPRGCADGRD ADPTEEQMAQ TQRNDQDQFE CQELLECQVQ VGAPDEEEEE EEDSGLVAEA EAVAAG
TRF1_HORSE MAEDSSSAAS SPRGRADGED AEPPEERAAA MARDDQEQFE CQELFECQVQ VGA? EA EEDAELVAEA ETVAAG
TRF1_INDIAN MUNTJAK |H!DTASAAQ SPRGRADGED AGSSKDRVAD TVTDDQEQFE CQELLECAVQ PGV? EE EEDPGLVAEA EAVAAG
TRF1_CHINE MUNTJAK MAEDTASAAQ SPRGRADGED AGSSKDRVAD TVTDDQEQFE CQELLECPVQ PGV? EE EEDPGLVAEA EAVAAG
TRF1_COW MAEDTASAAQ SPRGRADGED AGSPGERVAD TVTDDQEQFE CQELLDCQVQ PGVP EE EEEAGLMAEA EAVAAG
TRF1_DOLPHIN MAESAASAAP SPRGLADGED AGPLEERMAE TARDDQGQFE CQEPLECQMH LGAP EE EEDAGLVAEA EAVAAG
TRF1_ALPACA MAEDAASAAP SPRGRADGED AGPSEERTVE TARENQEQFE CQELLECQVQ LEAP EE EEDSGLVAEA EAVAAG
TRF1_PIG MAEGTPSAAP SPRGRADGED AELPEEQMAE TAREDQEQFE CQELLECQVQ LGAP PE EEDAGLVAEA EAVAAG
TRF1_DOG MAESAPSAAP SPRGCADGED AAPPEEATAE TPRDDQEQFE CQELLEYQVQ VGDP EE EEDAGAVAEA EAVAAG
TRF1_RABBIT MAEDAASGAA SPRGRADGKD ADSPEKRMTE TPRDDREQFV CQELLECQVQ EETP EE EEDAGLVAEA EAVAAG
TRF1_MEGABAT MAEDAAVAAP SPRGRADGQD AGPSEKRLPE AEREDQEQFE CQELLECQVE TGVP EG EEDAGLVAEA EAVAAG
TRF1_GUINEA PIG MAEEPASASP VPRGLADGEP ADAAEPELMQ KGRDEQEQIQ CQELLDCQVE FGV? EEE EEDADLVAKA EAVAAG
TRF1_MOUSE MAETVSSAAR DAPSREGWTD SDSPEQEEVG D DAELLQCQLQ LGTP RE MENAELVAEV EAVAAG
TRF1_RAT M&GTVTSAAP GARSNAGGTS ADSPEKEAAR D DAELFDCRVQ LGPP RE EENAELVAEA EAVAAG
TRF1_OPOSSUM TZZZZZZ^^ MKSSREREYI RALKRQFISF EEGEEEEEEE PAPYTVDPAA DSLACG
TRF1_CHICKEN g SEAGREREGG LVLFLPSALA EAVAAD
TRF1_XENOPUS LAEVIS MEE ETDGPPFDDT AAVATN
TRF1_ZEBRAFISH MESESHEITS TSDKTTSQEV NNVVQS
TRF2_ZEBRAFISH --MSDKPCEPSW EQIVNR
TRF2_XENOPUS LAEVIS -MESNSTLRE CGSPDPCIQL ERTINQ
TRF2_XENOPUS TROPI -METNSTLEE RRSPDSCKQL ERTINQ
TRF2_CHICKEN ^^^^^^^ ^^^^.(j^ KRSRAAMEEQ EKTSTRSDDR EQAVNR
TRF2_OPOSSUM -MPGGNSGNH DGQGRAASRR PSRRMGRPRR GRHETGLGGD GERGLGEARL EEAVNR
TRF2_WALLABY --MPGGGESH DGHGRAASRR PARRLGRARR GRHESGLRGD GERGVGEARL EEAVNR
TRF2_MOUSE --MAGGGGSS DSSGRAASRR ASRSGGRARR GRHEPGLGGA AERGAGEARL EEAVNR
TRF2_RAT --MAGGGGSS DNGGRAANRR ASRSGGRARR GRHEPGLGGA AERGAGEARL EEAVNR
TRF2_GUINEA PIG --MAGGGGSS EGSGRAGGRR TSRSSGRARR GRHESGLGGA AERGAGEARL EEAVNR
TRF2_MEGABAT --MAGGGGSS DGSGRAASRR ASRSGGRARR GRHDPGLGGA AERGAGEARL EEAVNR
TRF2_INDIAN MUNTJAK --MAGGGGSS DSSGRAAGRR ASRSGGRARR GRHAPGLGGA AESGAGEARL EEAVNR
TRF2_CHINE MUNTJAK --MAGGGGSS DSSGRVAGRR ASRSGGRARR GRHAPGLGGA AERGAGEARL EEAVNR
TRF2_COW --MAGGGGSS DSSGRAAGRR ASRSGGRARR GRHAPGLGGA AERGAGEVRL EEAVNR
TRF2_DOLPHIN --MAGGGGSS DSSGRAAGRR ASRSGGRARR GRHAPRLGGS AERGAGEARL EEAVNR
TRF2_PIG --MAGGGGSS DSSGRAAGRR TSRSGGRARR GRHAPRLGGA AERGAGEARL EEAVNR
TRF2_KANGAROO RAT --MAGGGGNS DRSGRAAGRR ASRSGGRARR GRHESGLGAA AERGAGDARL EEAVNR
TRF2_ORANGUTAN --MAGGGGSS DGSGRAAGRR ASRSSGRARR GRHEPGLGGP AERGAGEARL EEAVNR
TRF2_MARMOSET --MAGGGGSS DGSGRPAGRR ASRSSGRARR GRHEPGLGGP AERGAGEARL EEAVNR
TRF2_CHIMPANZEE --MAGGGGSS DGSGRAAGRR ASRSSGRARR GRHEPGLGGP AERGAGEARL EEAVNR
TRF2_HUMAN --MAGGGGSS DGSGRAAGRR ASRSSGRARR GRHEPGLGGP AERGAGEARL EEAVNR

Figure 3. Evolutionary analyses of TRF genes. (A) Alignment of the N-terminal sequences of TRF proteins in different species. The N-terminal
domains were defined by alignment of the protein sequences through their TRFH domain. (B) Curves showing the variations of the length in amino 
acids of the N-terminal domain of TRF1 (red) or TRF2 (blue) and of the number of acidic and basic residues in these domains (TRF1 in orange and 
TRF2 in green) as a function of the time since speciation on the lineage leading to the Eutherians. 



zebrafish (16 residues), and increases in Xenopus, and 
even further in chicken. The gray short-tailed opossum 
has a B domain very similar to that of the placental 
mammals. For TRF1, the Xenopus sequences are
shorter than the zebrafish domain (19 residues), but
show a high proportion of acidic residues. The chicken 
has a longer domain (27 residues), with several acidic resi-
dues. The opossum also has a long domain (48 residues)
but the Eutherians have the longest acidic domains (from
63 to 76 residues) which are also the richest in acidic 
residues. Thus, the evolution leading to the Eutherians 
seems to correlate with a gradual increase in both length 
and acidic/basic residues composition of the N-terminus 
of TRF proteins (Figure 3B). 
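
The length and charge composition discussed above are simple counts over each N-terminal sequence. A minimal Python sketch, assuming that D and E are counted as acidic and K and R as basic (the exact residue sets used for Figure 3B are not stated here, and the function name is hypothetical). The example sequence is the human TRF1 N-terminus as it appears in the alignment of Figure 3A:

```python
ACIDIC = set("DE")  # assumption: aspartate and glutamate
BASIC = set("KR")   # assumption: lysine and arginine (histidine not counted)


def charge_composition(domain):
    """Return the length and the numbers of acidic and basic residues of a
    protein sequence given in one-letter code (spaces are ignored)."""
    seq = domain.replace(" ", "").upper()
    acidic = sum(1 for aa in seq if aa in ACIDIC)
    basic = sum(1 for aa in seq if aa in BASIC)
    return {"length": len(seq), "acidic": acidic, "basic": basic}


human_trf1_n_terminus = (
    "MAEDVSSAAP SPRGCADGRD ADPTEEQMAE TERNDEEQFE "
    "CQELLECQVQ VGAPEEEEEE EEDAGLVAEA EAVAAG"
)

composition = charge_composition(human_trf1_n_terminus)
print(composition)
```

For this sequence the sketch counts 76 residues, of which 26 are acidic and 3 are basic under the assumptions above.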

Acquisition of these domains does not seem to result 
from exon addition but rather an extension of the first 
TRF exon by a continuous gain of the 5' non-coding
sequence, since: (i) we observe a gradual increase in the
length and (ii) the base composition of the 5'UTR is strikingly
similar to that of the corresponding A/B domains.
In the vicinity of the human terf1 and terf2 genes, we
calculated a striking GC content of 0.639 (5'UTR) and
0.677 (A domain) in TRF1, and 0.708 (5'UTR) and 
0.793 (B domain) in TRF2. This suggests that the GC 
content of the 5'UTR is correlated to that of the A/B 
domain. Moreover, GC-rich nucleotide sequences, when
incorporated into a coding sequence, tend to yield larger
numbers of charged residues (particularly D, E, R, S and
G, which are hallmarks of the N-terminal domains of 
TRF1 and TRF2). 
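
GC content as used above is the fraction of G and C among all nucleotides of a sequence. A minimal Python sketch follows; the function name is hypothetical, and the example uses three TTAGGG telomeric repeats rather than the actual 5'UTR or A/B domain sequences, which are not reproduced here:

```python
def gc_content(sequence):
    """Fraction of G and C among the A, C, G, T and U characters of a
    nucleotide sequence; any other character (gaps, spaces) is ignored."""
    nucleotides = [base for base in sequence.upper() if base in "ACGTU"]
    if not nucleotides:
        raise ValueError("no nucleotides found in the sequence")
    gc = sum(1 for base in nucleotides if base in "GC")
    return gc / len(nucleotides)


telomeric_repeats = "TTAGGG" * 3  # illustrative input only
print(gc_content(telomeric_repeats))
```

Applying the same function to a 5'UTR and to the corresponding coding segment would give the pair of values compared in the text.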

Altogether, based on the sequences available so far, the
A/B domains of TRF proteins seem to originate from a 
gradual gain of amino acids by successive and iterative 
incorporation of the 5'UTR in the coding sequence all 
along the lineage leading to the Eutherians. 

Importance of the length of the N-terminal acidic domain 
of TRF1 

To test the consequences of the evolution of the 
N-terminal domains on the biochemical properties of 
TRF1 and TRF2, we analyzed the capacity of the acidic 
domain from mouse and chicken to inhibit strand invasion 
and DNA condensation (Figure 4). We fused these 
N-termini to TRF2 ΔB (TRF2 mAΔB and TRF2 cAΔB for
mouse and chicken sequences, respectively) and charac-
terized these chimeric proteins. Mouse (63 residues) and

Figure 4. Importance of the length of the N-terminal acidic domain of TRF1. (A) Invasion assays performed with supercoiled pTelo2 and increasing
concentrations (10, 30, 100, 300 nM and 1 µM) of TRF2, TRF2 ΔB, TRF2 hAΔB, TRF2 mAΔB and TRF2 cAΔB. (B) Variations of the relative stimulation
of invasion at maximum as a function of the length of the acidic N-terminal domain. (C) 2D-probability density maps of volumes and DNA CLs
measured by AFM for the complexes formed with TRF1, TRF2, TRF2 cAΔB, TRF2 mAΔB and TRF2 hAΔB. The red and green crosses indicate the
position in the graph of the majority of TRF1 and TRF2 complexes, respectively.

chicken domains (27 residues) are shorter than the
human one (76 residues), but have similar isoelectric 
points (3.65, 3.73 and 4.4 for human, mouse and 
chicken, respectively). 

Invasion assays (Figure 4A and B) and AFM 
(Figure 4C) experiments show a similar trend and suggest 
a correlation between the length of the acidic domain and 
the ability to inhibit both DNA condensation and the stimulation
of invasion. The longer the A domain, the less strand invasion
and DNA condensation were observed. Our results indicate
that evolution of TRF1 proteins might have been 
associated with a progressive inhibition of their capacity 
to condense DNA and to stimulate telomeric invasion. 

Telomeric RNA negatively regulates TRF2-mediated 
DNA condensation 

The binding of molecules to the N-termini of TRF1 and 
TRF2 is expected to have an effect on their behavior 
regarding strand invasion and DNA condensation. 
For TRF2, one possible candidate is TERRA, the telo- 
meric RNA that was recently proposed to interact with the 
N-terminal basic domain of TRF2 (33). To study this, we 
performed topoisomerase I assays and AFM experiments 
in the presence of a G-RNA containing three (UUAGGG)
repeats and a C-RNA bearing the (CCCUAA)3 sequence.
As seen in Figure 5A and B, while the C-RNA had little
effect on TRF2-mediated modification of topology, the 
G-RNA greatly reduced the ability of TRF2 to condense 
DNA. This was not due to a reduction in the binding of 
duplex DNA since no important changes in the amount 
of DNA bound by EMSA were observed in the presence 
of either RNA (Figure 5C). In accordance with published 
results (33), direct binding of TRF2 on these RNAs 
(Figure 5D) showed that TRF2 strongly prefers the 
G-RNA to the C-RNA, which could explain their differ- 
ential effect. Of note, a faint band of C-RNA-TRF2 
complex is visible at the highest concentration of 
protein, suggesting that recognition of the C-RNA by 
TRF2 is possible. AFM experiments show that adding 
G-RNA to TRF2 prior to DNA binding decreases its 
capacity to form condensed complexes. TRF2 complexes 
strikingly resemble those obtained for TRF1 (Figure 5E). 
The C-RNA has a much weaker but not negligible effect 
probably due to the residual binding mentioned above. 

Collectively, these experiments show that the binding 
of G-RNA on TRF2 severely hinders its capacity to 
condense DNA and to modify DNA topology. 

DISCUSSION 

Our phylogenetic studies on the terf1 and terf2 genes
suggest that they originated from the duplication of an 
ancestral gene and that their N-termini gradually 
increased in size through iterative incorporation of se- 
quences from their 5'UTR. These additions had profound 
consequences on the behavior of these proteins. They 
created binding sites for several telomere-interacting mol- 
ecules (3) and conferred to TRF2 the ability to bind 
Holliday junctions and to protect them from resolvase 
cleavage (25). Here, we reveal that these domains also
make important contributions to the ability of TRF1
and TRF2 to condense telomeric DNA and to stimulate 
telomeric invasion. Indeed, we show that TRF1 ineffi- 
ciency in modifying topology and stimulating invasion 
can be attributed to the acidic nature of its N-terminus. 
Conversely, the basic nature of the N-terminus of TRF2 
causes an increase in these capacities for TRF2. Analyzing 
the acidic domains of TRF1 proteins from chicken, mouse 
and human, we have found that the length rather than the 
pI seems important, indicating that the functional diver-
gence between TRF1 and TRF2 may have gradually
increased all along the Chordate lineage. 

TRF2-mediated DNA condensation leads to the 
untwisting/unwrithing of the surrounding constrained 
DNA which is thought to stimulate telomeric invasion, a 
reaction involved in the folding of telomeric DNA into 
t-loop (26). Our data strongly reinforce this view since 
we observe a striking correlation between the capacity of 
TRF2 and its mutants to condense DNA, to modify DNA 
topology and to stimulate invasion. Furthermore, we have 
uncovered that stimulation of telomeric invasion is 
mediated by a striking increase in the kinetics of the 
invasion reaction (steady state is reached in 15 min in
the presence of TRF2, 11 h in its absence).

Overall, these data suggest a role for TRF2 in the regu- 
lation of DNA topology on telomeres. In accordance, a 
recent study (34) shows that TRF2 acts together with 
topoisomerase 2 to protect telomeric DNA from replica- 
tive DNA damage. One major future challenge will be to 
elucidate the mechanism and the precise role of TRF2 in 
this process. 

One interesting idea raised by our work is that modifi-
cation(s) of the N-terminal domains of TRF1 and TRF2
might result in a TRF1 protein exhibiting TRF2-like 
behavior or vice versa. To our knowledge, modifications 
of TRF1 N-terminal domain have not been reported so 
far. However, one cannot exclude the possibility of a regu- 
lation through the binding of a protein partner. 
A tankyrase-binding motif exists in most of the TRF1 
N-termini from mammals but tankyrases stand as poor 
candidates since mouse and rat TRF1s lack this motif
and their ultimate role is the inhibition of TRF1 DNA 
binding (3). In contrast, the N-terminal basic domain of 
TRF2 is the target of several binding partners and
at least one post-translational modification. Arginines
17 and 18 in this domain were shown to be methylated 
(35). Proteins such as the Werner protein as well as ORC1 
were shown to recognize this domain (36,37). Similarly, 
the telomeric RNA, TERRA, was recently proposed to 
interact with the N-terminus of TRF2 (33). TERRA is a 
very attractive candidate for regulating the functions of 
the N-terminus of TRF2 since it could both mask the 
positive charges of this domain and repel DNA. To
investigate this hypothesis we performed AFM experi- 
ments and topology assays in the presence of small 
RNA molecules containing telomeric repeats. The effect 
of the G-rich small RNA was striking with a total abroga- 
tion of the capacity of TRF2 to condense DNA (Figure 5). 

Figure 5. Telomeric RNA negatively regulates TRF2-mediated DNA condensation. (A) Topology assay performed using the telomeric plasmid 
pLTelo and 1 µM of TRF2 in the presence of wheat germ topoisomerase I (WG Topo I) and increasing amounts of either G- or C-RNA (0.5, 1, 1.5, 
2 and 3 µM). SC stands for supercoiled plasmid, RC for relaxed circular, N for nicked. (B) Plot of data in (A) showing the variations of the average 
number of supercoils as a function of the ratio between the concentration of dimers of TRF2 and the concentration of G-RNA (light gray curve) or 
C-RNA (dark gray curve). Error bars correspond to standard deviation. (C) EMSA using 5 nM of a double-stranded telomeric probe and 20 nM of 
TRF2 in the presence of increasing amounts of G- and C-RNA (10, 20, 30, 40 and 60 nM). (D) EMSA using 5 nM of labeled G- and C-RNA and 
increasing amounts of TRF2 (10, 30, 50, 100 and 300 nM). (E) 2D-probability density maps of volumes and DNA CLs measured by AFM for 
complexes formed with TRF1 and TRF2 in the presence of a ratio of 3 G- and C-RNA per TRF2 dimer. The red and green crosses indicate the 
position in the graph of the majority of TRF1 and TRF2 complexes, respectively. 

Although these RNA molecules are far shorter than 
the natural TERRA, which can reach 9 kb, and lack the 
sub-telomeric sequences present in TERRA, it is tempting 
to speculate that TERRA may impact TRF2 behavior 
concerning DNA topology and even t-loop formation. 
One could imagine that TERRA could guide TRF2 
complexes from a DNA-condensation/t-loop proficient 
type of complex to a more TRF1/shelterin proficient 
type. It follows that TERRA could impede t-loop 
formation, an event that could be deleterious to the cell 
but may also be a necessary step during replication. In 
accordance, TERRA has been proposed to regulate 
some aspects of telomere replication (33). TERRA could 
even be involved in the removal of the t-loop. Indeed, the 
N-terminal domain of TRF2 has been implicated in the 
protection of the t-loop (38) and Holliday junctions (25) 
against resolution events. One could speculate that 
TERRA binding could alleviate this protection and 
cause t-loop HR. In accordance with this hypothesis, it 
has been observed that depletion of nonsense-mediated 
decay (NMD) proteins, which increases the number of 
TERRA foci in human cells, causes sudden telomere loss 
and a marked increase in telomeric fragments (39). This 
would make TERRA a crucial factor in regulating telo- 
meric folding and state. 

This work also raises the question of why different 
N-termini have evolved for TRF1 and TRF2 during 
Chordate evolution. We show here that these domains 
regulate the ability of TRF proteins to condense telomeric DNA. 
Strikingly, the N-terminal part of TRF2 might facilitate 
heterochromatin formation by binding both ORC1 and 
TERRA (33), suggesting that the N-terminal parts of 
the TRF proteins are crucial to regulate telomeric chro- 
matin condensation both through intrinsic properties and 
interactions with chromatin components. This is expected 
to be crucial for proper organization and dynamics of 
telomeric chromatin during the cell cycle, but it could 
also lead to long-range chromatin interactions between 
telomeres and non-telomeric chromosomal loci. Recent 
studies from our group and that of Zhou Songyang 
(40,41) have revealed the presence of TRF2 outside telomeres, 
specifically on centromeric/pericentromeric satellite 
DNA and interstitial telomeric sequences. Often 
located in the proximity of genes or within introns, these 
TRF2 binding sites may be involved in the regulation of 
the corresponding genes and thus participate in the cell 
transcriptional program. In view of the present results, it 
is tempting to speculate that the acquisition of specialized 
N-termini by TRF proteins during the evolution of chord- 
ates may have thus contributed to the establishment of 
new transcriptional regulatory programs. 